Project 2: Kushagra #21

Open: wants to merge 18 commits into base: master

@Kushagra-Goel commented Sep 18, 2019

  • Repo Link
  • Stream Compaction
    • CPU Scan
    • Naive Scan
    • Work Efficient Scan with thread optimization (Extra Credit)
      • Used a different indexing scheme to reduce the number of idle threads
      • Increased the number of blocks as the depth increases
    • CPU Compaction
      • Without Scan
      • With Scan
    • GPU Compaction with Work Efficient scan
    • Radix Sort (Extra Credit)
  • Character Recognition
    • MultiLayer Perceptron
      • The MLP can be run with any number of layers; the architecture is not fixed. Only the middle-layer activation (ReLU) and the final-layer activation (Softmax) are fixed (Extra Credit?)
      • Computes the gradients dynamically instead of using a fixed formula
      • Converged on XOR and the Character Dataset
      • Converged on MNIST (Extra Credit)
      • Batch processing instead of individual (Extra Credit?)
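
For reviewers unfamiliar with the "CPU Scan" and "CPU Compaction With Scan" items above, here is a minimal CPU sketch of an exclusive scan and of compaction built on it (map, scan, scatter). It is not the code from this PR: the names `exclusiveScan` and `compactWithScan` are hypothetical, and the sketch assumes that compaction means dropping zero elements, which is what the sample output suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum: out[i] = in[0] + ... + in[i-1], with out[0] = 0.
// (Hypothetical helper name, chosen for illustration only.)
std::vector<int> exclusiveScan(const std::vector<int>& in) {
    std::vector<int> out(in.size(), 0);
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;
        running += in[i];
    }
    return out;
}

// Stream compaction as map -> scan -> scatter.
// Assumption: "compaction" removes zero elements and keeps the rest in order.
std::vector<int> compactWithScan(const std::vector<int>& in) {
    if (in.empty()) return {};

    // Map: 1 for elements to keep, 0 for elements to drop.
    std::vector<int> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        flags[i] = (in[i] != 0) ? 1 : 0;
    }

    // Scan: the exclusive scan of the flags gives each kept element's output index.
    const std::vector<int> indices = exclusiveScan(flags);

    // The number of kept elements is the last index plus the last flag.
    const int count = indices.back() + flags.back();
    assert(count >= 0 && static_cast<std::size_t>(count) <= in.size());

    // Scatter: write each kept element to its computed position.
    std::vector<int> out(static_cast<std::size_t>(count));
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) {
            out[static_cast<std::size_t>(indices[i])] = in[i];
        }
    }
    return out;
}
```

The three passes are independent per element apart from the scan itself, which is why the same structure maps onto the GPU once a parallel scan is available.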

I am using 1 bonus day for this assignment. It was mainly spent on the README guidelines.

Kushagra-Goel and others added 18 commits September 15, 2019 22:35

<a name = "output"/>
## Sample Output
```
****************
** SCAN TESTS **
****************
    [  23  47  49  19  49  18  42  24  39  24   0  23  17 ...   9   0 ]
==== cpu scan, power-of-two ====
   elapsed time: 4.554ms    (std::chrono Measured)
    [   0  23  70 119 138 187 205 247 271 310 334 334 357 ... 25650262 25650271 ]
==== cpu scan, non-power-of-two ====
   elapsed time: 1.5641ms    (std::chrono Measured)
    [   0  23  70 119 138 187 205 247 271 310 334 334 357 ... 25650190 25650225 ]
    passed
==== naive scan, power-of-two ====
   elapsed time: 5.29328ms    (CUDA Measured)
    passed
==== naive scan, non-power-of-two ====
   elapsed time: 5.14387ms    (CUDA Measured)
    passed
==== work-efficient scan, power-of-two ====
   elapsed time: 1.35523ms    (CUDA Measured)
    passed
==== work-efficient scan, non-power-of-two ====
   elapsed time: 1.34246ms    (CUDA Measured)
    passed
==== thrust scan, power-of-two ====
   elapsed time: 3.9569ms    (CUDA Measured)
    passed
==== thrust scan, non-power-of-two ====
   elapsed time: 3.39987ms    (CUDA Measured)
    passed

*****************************
** STREAM COMPACTION TESTS **
*****************************
    [   1   3   3   3   1   2   2   0   1   0   2   1   1 ...   3   0 ]
==== cpu compact without scan, power-of-two ====
   elapsed time: 2.5452ms    (std::chrono Measured)
    [   1   3   3   3   1   2   2   1   2   1   1   3   3 ...   1   3 ]
    passed
==== cpu compact without scan, non-power-of-two ====
   elapsed time: 2.6496ms    (std::chrono Measured)
    [   1   3   3   3   1   2   2   1   2   1   1   3   3 ...   1   2 ]
    passed
==== cpu compact with scan ====
   elapsed time: 9.8439ms    (std::chrono Measured)
    [   1   3   3   3   1   2   2   1   2   1   1   3   3 ...   1   3 ]
    passed
==== work-efficient compact, power-of-two ====
   elapsed time: 6.4849ms    (CUDA Measured)
    passed
==== work-efficient compact, non-power-of-two ====
   elapsed time: 6.31091ms    (CUDA Measured)
    passed
Press any key to continue . . .
```
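
The scan tests above run every implementation on both power-of-two and non-power-of-two sizes. The sketch below shows, sequentially on the CPU, the up-sweep and down-sweep phases of a work-efficient (Blelloch) scan and one common way to handle the non-power-of-two case: zero-padding to the next power of two. This is an illustration under assumptions, not the kernel code in this PR. The names `nextPowerOfTwo` and `blellochScan` are hypothetical, and the PR does not state how it handles odd sizes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest power of two >= n (returns 1 for n <= 1).
std::size_t nextPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Sequential sketch of a work-efficient exclusive scan.
// Each inner-loop iteration corresponds to the work one GPU thread would do,
// so a level with stride s needs only padded / (2 * s) active threads.
std::vector<int> blellochScan(const std::vector<int>& in) {
    const std::size_t n = in.size();
    if (n == 0) return {};

    // Assumption: non-power-of-two inputs are zero-padded.
    const std::size_t padded = nextPowerOfTwo(n);
    assert((padded & (padded - 1)) == 0);
    std::vector<int> data(padded, 0);
    std::copy(in.begin(), in.end(), data.begin());

    // Up-sweep (reduce): build partial sums in place, bottom-up.
    for (std::size_t stride = 1; stride < padded; stride <<= 1) {
        for (std::size_t i = 2 * stride - 1; i < padded; i += 2 * stride) {
            data[i] += data[i - stride];
        }
    }

    // Down-sweep: clear the root, then push prefix sums back down the tree.
    data[padded - 1] = 0;
    for (std::size_t stride = padded >> 1; stride >= 1; stride >>= 1) {
        for (std::size_t i = 2 * stride - 1; i < padded; i += 2 * stride) {
            const int left = data[i - stride];
            data[i - stride] = data[i];
            data[i] += left;
        }
    }

    data.resize(n);  // Drop the padding.
    return data;
}
```

Running it on the first 13 values of the scan test input reproduces the first 13 values of the printed CPU scan result.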